Hidden feature models for speech recognition using dynamic Bayesian networks
نویسندگان
چکیده
In this paper, we investigate the use of dynamic Bayesian networks (DBNs) to explicitly represent models of hidden features, such as articulatory or other phonological features, for automatic speech recognition. In previous work using the idea of hidden features, the representation has typically been implicit, relying on a single hidden state to represent a combination of features. We present a class of DBN-based hidden feature models, and show that such a representation can be not only more expressive but also more parsimonious. We also describe a way of representing the acoustic observation model with fewer distributions using a product of models, each corresponding to a subset of the features. Finally, we describe our recent experiments using hidden feature models on the Aurora 2.0 corpus.
منابع مشابه
Investigating Mixed Discrete/Continuous Dynamic Bayesian Networks with Application to Automatic Speech Recognition
Notation s t The state of a discrete (switch) hidden variable at time t h t The state of a continuous hidden variable at time t o t A feature vector at time t v t A sample of the speech signal at time t x 1:T Shorthand for x 1 , x 2 ,. .. , x T φ A particular setting of the HMM parameters
متن کاملHidden factor dynamic Bayesian networks for speech recognition
This paper presents a novel approach to modeling speech data by Dynamic Bayesian Networks. Instead of defining a specific set of factors that affect speech signals the factors are modeled implicitly by speech data clustering. Different data clusters correspond to different subsets of the factor values. These subsets are represented by the corresponding factor states. The factor states along wit...
متن کاملDynamic Bayesian Networks for Multi-Dialect Isolated Arabic Recognition
Hidden Markov Models (HMM) are currently widely used in Automatic Speech Recognition (ASR) as being the most effective models. In addition, the HMM are just a special case of graphical models which are dynamic Bayesian Networks (DBN). These are modeling tools more sophisticated because they allow to include several specific variables in the problem of automatic speech recognition other than the...
متن کاملشبکه عصبی پیچشی با پنجرههای قابل تطبیق برای بازشناسی گفتار
Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
متن کامل